| GSA | Commercial_LF_coverage | Survey_LF_coverage | Direct_age_data | Age_data_used |
|---|---|---|---|---|
| 1 | IEO LF from ~2001, DCF LF from 2002 onward | MEDITS LF 2002–; 2020 missing (NA in index) | No | All ages from LF slicing (sex-combined VBGF; quarterly/stochastic method under development) |
| 5 | Landings since 1940; LF since ~1980 | BALAR LF 2002–2006; MEDITS LF 2007– | No | BALAR+MEDITS LF sliced to age; commercial LF sliced with same VBGF; large-hake LPUE exploratory only |
| 6 | DCF LF 2002–2024; longer national series | MEDITS LF except 2020 (NA) | No | LF-based catch-at-age; ICATMAR LPUE and sex/length info used diagnostically, not as age data |
| 7 | DCF LF 2002–2024 (France + Spain combined) | MEDITS LF except 2020 (NA, autumn survey) | No | LF-based age compositions; sex-ratio by length available but not yet used for sex-specific age input |
GFCM GSA 1, 5, 6, 7 hake notes
Executive Summary
This is a review assembled over a short period with the help of software that assisted in interrogating documents and databases. The notes attempt to prioritize the most visible strengths, gaps, and near-term risks rather than a full re-analysis. Treat the points below as intended constructive suggestions to support follow-up work (GitHub repo).
Relative to growth:
- WestMed tagging-based VBGF (\(L_\infty\) ≈ 110 cm, k ≈ 0.178) remains the operational baseline; Adriatic/Central Med curves are smaller, and growth uncertainty should be carried through slicing rather than fixed.
- The Medits tables lacked columns showing individual fish weight (that I could find). Having these data for perusal could help with some of the stock delineation issues and patterns on condition factors.
- similarly, GSI data were shown for some areas (N GSA 6?), the pattern seemed to show some variable seasonality that should be explored further (actually those data must be from fishery collections because they showed many months of the year (Medits is only a few months per year depending on GSA).
- Synthetic LFD tests show slicing outcomes are sensitive to modest shifts in \(L_\infty\) and \(k\); lower \(L_\infty\) or higher \(k\) moves catch-at-age toward younger ages and lowers implied L50.
For the assessment benchmark
- For combined GSA model runs, consideration of within GSA indicators should be provided so that consistency or lack thereof between GSAs can be considered
- For SS3 configurations, data should minimally be provided at the quarter-of-the year resolution by gear and where plausible, metier. The rationale here is to improve the ability to estimate recruitment from different times within years. Without doing this, it is likely that the ability to use appropriately conditioned variation in lengths-given-age and reasonably resolve a single growth curve.
- The catch biomass data should be at the finest level of the within GSA fleet and quarter level.
- Length freqency data should be proportional to the catch for fishery data (absolute numbers-at-length are unnecessary)
- Some indication of LF sampling intensity should be available (e.g., number of hauls or trips sampled)
- Where catch biomass reliability changes, it should be noted (uncertainty of catch)
- Survey index data should note where there are lapses in distribution of tows etc. This may help to acknowledge changes in the fish’s “availability” to the survey
Other comments:
- Nominal trawl CPUE proxies (catch per HP/TRB) might be considered in future assessments or presented as a separate stock indicator. Approximate nominal rates indicate trends but are unstandardized and should be treated cautiously.
- MEDITs exploration highlights strong spatial/temporal heterogeneity in densities, lengths, and sex ratios across GSAs, underscoring the need for GSA-specific diagnostics.
- For a4a models evaluate stochastic growth in slicing and rototype sex-specific slicing where data allow.
1 Biological and Population Context
1.1 Overview
These notes attempt to synthesize information from the provided GFCM and STECF sources relevant to European hake (Merluccius merluccius) in the Mediterranean (GSAs 1–5–6–7).
- The MED_UNITs synthesis on stock structuring and biological units Spedicato and coauthors (2022).
- The GFCM Ad-hoc Working Group on European Hake (WGHKE) report (Rome, March 2025) GFCM Ad-hoc Working Group on European Hake (WGHKE) (2025).
- The STECF Ad-hoc Hake Assessment (EWG 25-06) Scientific, Technical and Economic Committee for Fisheries (STECF) (2025).
- The European hake growth study in the Gulf of Lions Mellon-Duval et al. (2010).
- Benchmark work for Adriatic and Central Mediterranean hake stocks Carbonara et al. (2023); Spedicato and coauthors (2022).
- ICATMAR complementary biological data for northern GSA 6, including historical growth parameters and VBGF curves by sex for the WestMed stock unit.
As part of the review these notes give:
- A concise comparison of growth parameters across GSAs.
- A record of how these differences intersect with the current combined assessment for GSAs 1–5–6–7.
- Practical implications for slicing, model choice, and benchmark planning.
- Some draft evaluations of Medits data for easier compmarisons over GSAs
1.2 Stock structuring and implications for growth
The MED_UNITs project used genetic markers (SNPs), otolith shape, otolith microchemistry, and environmental covariates to delineate biological units of European hake across the Mediterranean Spedicato and coauthors (2022).
Key points:
- Clear separation between Atlantic and Mediterranean hake.
- Within the Mediterranean, evidence for three main population groups: Western, Central, and Eastern.
- Otolith shape yields four robust clusters (Atlantic, Western Med, Adriatic, Eastern Med), aligning well with the genetic structuring.
- Otolith microchemistry alone has weaker classification power and higher temporal variability, making it less reliable for fine-scale stock boundaries.
While MED_UNITs does not provide new growth curves, the consistent regional differentiation supports the hypothesis that growth parameters differ among subregions and that using a single VBGF across the basin is biologically unrealistic.
1.3 Sexual dimorphism
All major sources agree that European hake shows strong sexual dimorphism in growth:
- Females attain substantially larger asymptotic lengths than males.
- Males mature earlier at smaller sizes and rarely approach the female \(L_\infty\).
- Sex-specific growth has been implemented in several recent benchmarks (Adriatic, Strait of Sicily, and GSA 19 trials) using integrated models such as Stock Synthesis Carbonara et al. (2023); Scientific, Technical and Economic Committee for Fisheries (STECF) (2025); GFCM Ad-hoc Working Group on European Hake (WGHKE) (2025).
- ICATMAR’s northern GSA 6 monitoring confirms this pattern: males dominate small sizes and rarely exceed ~45 cm, whereas females make up almost all of the largest length classes in local LFDs.
For GSAs 1–5–6–7, current combined assessments still rely primarily on sex-aggregated slicing, though both WGHKE and STECF explicitly recommend moving to sex-specific slicing whenever possible GFCM Ad-hoc Working Group on European Hake (WGHKE) (2025); Scientific, Technical and Economic Committee for Fisheries (STECF) (2025).
1.4 Overview
This note summarizes what length-frequency (LF) data exist by GSA (1, 5, 6, 7) for European hake and how these are converted into age compositions for recent STECF/GFCM work. No validated age–length keys (ALKs) or otolith-based age data are currently used for these GSAs; all age data are reconstructed from LF using von Bertalanffy growth parameters (Mellon-Duval et al. 2010; updated in the a4a/FLR framework as a stochastic VBGF centered on Linf ≈ 110 cm, k ≈ 0.178, t0 ≈ −0.04).
All GSAs share the following general features:
- Commercial LF data are available from DCF sampling since 2002 (longer in some GSAs via national sources).
- MEDITS survey LF data are available from 2002 onward (with 2020 problematic and treated as missing/NA in the index).
- Catch-at-age and survey-at-age are fully derived from LF via deterministic (and, in future, quarterly/stochastic) slicing.
1.4.1 GSA 1 – Northern Alboran Sea
- Commercial LF:
- IEO historical landings and LF back to ~2001; DCF LF consistently from 2002 onward.
- Survey LF:
- MEDITS LF from 2002 onward, except 2020 (no survey; treated as NA in the tuning index).
- Direct age data:
- None (no validated ALKs or age-readings supplied).
- Derived age data:
- Catch-at-age and survey-at-age obtained by slicing LF with sex-combined VBGF.
- Quarterly/stochastic slicing with age-0 → age-1 in Q1–Q2 has been developed but is not yet the operational standard.
1.4.2 GSA 5 – Balearic Islands
- Commercial LF:
- Landings time series from 1940; LF data from ~1980 onward (IEO/COISPA and related work).
- Survey LF:
- BALAR bottom trawl surveys (2002–2006) using MEDITS gear and protocol.
- MEDITS surveys from 2007 onward (eastern and western islands since 2021).
- Direct age data:
- None submitted (no ALKs used in current assessments).
- Derived age data:
- Combined BALAR+MEDITS LF are sliced to produce survey-at-age.
- Commercial LF sliced with sex-combined VBGF; quarterly/stochastic variants explored.
- GSA 5 also used to test large-hake LPUE (size-category based), but these indices are not yet used as assessment tuning data.
1.4.3 GSA 6 – Northern Spain
- Commercial LF:
- DCF LF for 2002–2024; longer national series from IEO and ICATMAR (including effort and gear-specific details).
- Survey LF:
- MEDITS LF for all years except 2020 (partial survey; 2020 index omitted/NA).
- Direct age data:
- None (no ALKs or agreed ageing protocol).
- Derived age data:
- Catch-at-age obtained by LF slicing (sex-combined VBGF; stochastic quarterly slicing tested in EWG 25-06).
- Additional ICATMAR information (sex ratios, longline and gillnet LPUE, historical VBGF by assessment area) is available and useful for diagnostics but not yet structurally integrated as an age-based index.
1.4.4 GSA 7 – Gulf of Lions (France and Spain)
- Commercial LF:
- DCF LF for 2002–2024 from both France and Spain.
- Survey LF:
- MEDITS LF for all years except 2020 (survey shifted to autumn; 2020 treated as NA).
- Direct age data:
- None (no ALKs provided; no common ageing protocol in use).
- Derived age data:
- Catch-at-age reconstructed from LF via slicing (sex-combined growth).
- France supplies quarterly sex-ratio-by-length data (2011–2024), used to build a synthetic sex-ratio curve; these support future sex-specific slicing but are not yet used in the operational a4a input.
1.5 LFD Comparison Table
1.6 LF only (no age data)
- There are no observed ages (ALKs or otolith age-readings) in the current operational workflow for GSAs 1, 5, 6, and 7.
- All age compositions, both for commercial catches and MEDITS/BALAR surveys, are reconstructed from LF via growth-based slicing.
- The main near-term improvements identified are:
- Adoption of quarterly and stochastic slicing (with biologically consistent handling of age-0),
- Introduction of sex-specific growth and slicing once data preparation and tools are mature,
- Future incorporation of ALKs if and when robust ageing protocols and intercalibration are in place.
1.7 Comparative Growth Parameters (VBGF)
1.7.1 Combined-sex growth
The table below compiles key VBGF parameter sets for combined sexes from the uploaded material and recent EWG 25-06 growth work.
| Region | Source | Linf_cm | k | t0 |
|---|---|---|---|---|
| GSAs 1–5–6–7 | Mellon-Duval 2010 (tagging) | 110.0 | 0.178 | -0.005 |
| GSAs 1–5–6–7 | STECF 25-06 / ICATMAR-FLR VBGF | 109.6 | 0.178 | -0.036 |
| GSAs 17–18 (Adriatic) | Adriatic SS3 benchmark | 88.0 | 0.200 | -0.100 |
| GSA 19 (Ionian) | GSA 19 SS3 trial | 90.0 | 0.200 | -0.100 |
Notes:
- The Mellon-Duval curve (Gulf of Lions) Mellon-Duval et al. (2010) is effectively the traditional default for GSAs 1–5–6–7 in the current a4a configuration.
- EWG 25-06 and the ICATMAR/a4a-FLR work implement a stochastic VBGF centered on essentially the same parameters (\(L_\infty\) ≈ 109.6 cm, k ≈ 0.178, t0 ≈ −0.04), with uncertainty carried through slicing rather than changing the mean curve.
- Both WGHKE and STECF highlight that this \(L_\infty\) (≈ 110 cm) is strongly influenced by tagging increment data, which tends to pull \(L_\infty\) toward maximum observed lengths and may not correspond to mean asymptotic length GFCM Ad-hoc Working Group on European Hake (WGHKE) (2025); Scientific, Technical and Economic Committee for Fisheries (STECF) (2025).
- In contrast, integrated models in the Adriatic (GSAs 17–18) and Central Med (GSA 19) estimate lower combined-sex \(L_\infty\) (≈ 88–90 cm), especially when growth is estimated from ALKs within Stock Synthesis Carbonara et al. (2023); Scientific, Technical and Economic Committee for Fisheries (STECF) (2025).
1.7.2 Sex-specific growth
The next table summarizes sex-specific VBGF parameters used or discussed for different regions.
| Region | Sex | Source | Linf_cm | k | t0 |
|---|---|---|---|---|---|
| GSAs 1–5–6–7 | Female | Mellon-Duval 2010 | 105 | 0.236 | 0.00 |
| GSAs 1–5–6–7 | Male | Mellon-Duval 2010 | 73 | 0.233 | 0.00 |
| GSA 8–11 | Female | WGHKE 2025 | 95 | 0.160 | -0.06 |
| GSA 8–11 | Male | WGHKE 2025 | 60 | 0.265 | -0.06 |
| GSAs 17–18 | Female | Adriatic SS3 | 90 | 0.200 | -0.10 |
| GSAs 17–18 | Male | Adriatic SS3 | 65 | 0.240 | -0.10 |
| GSA 19 | Female | GSA 19 SS3 | 95 | 0.200 | -0.10 |
| GSA 19 | Male | GSA 19 SS3 | 70 | 0.250 | -0.10 |
Observations:
- In GSAs 1–5–6–7, females have \(L_\infty\) ≈ 100–105 cm, males ≈ 70–75 cm, with similar k; this is consistent with Mellon-Duval et al. (2010) and with the ICATMAR compilation of historical VBGF-by-sex for the WestMed stock unit.
- In GSA 8–11, sex-specific curves are clearly smaller for females (\(L_\infty\) = 95 cm) and males (\(L_\infty\) = 60 cm) relative to GSAs 1–5–6–7 GFCM Ad-hoc Working Group on European Hake (WGHKE) (2025).
- Adriatic and Central Med (GSAs 17–18 and 19) show \(L_\infty\) in the range 90–95 cm for females, lower than the West Med tagging-based estimates Carbonara et al. (2023); Scientific, Technical and Economic Committee for Fisheries (STECF) (2025).
- EWG 25-06 concluded that, given current evidence (including ICATMAR growth work), there is no strong empirical basis yet to replace the WestMed VBGF, but that growth uncertainty should be propagated explicitly (stochastic slicing, alternative \(L_\infty\) sensitivities) rather than treated as fixed truth.
1.7.3 Gradient in \(L_\infty\) across the basin
The plot below shows a schematic gradient in female \(L_\infty\) across broad regions.
Interpretation:
- The highest \(L_\infty\) for females is consistently in the Western Med combined (1–5–6–7), especially when tagging information is included.
- Adjacent Western Med (8–11) and Central Med (17–18, 19) show somewhat lower \(L_\infty\), even before considering bi-phasic growth.
- Eastern Med is data-poorer in these documents, but where estimates exist, they generally do not exceed the West-Med tagging-based values.
1.7.4 Schematic “growth map” by GSA
| GSA | Subregion | Growth_Comment |
|---|---|---|
| 1 | West | High \(L_\infty\); combined assessment; tagging-based |
| 3 | West | Distinct growth; TRANSBORAN boundary with GSA 1 |
| 4 | West | Potential bias from age-0 sampling; slower than 1–5–6–7 |
| 5 | West | Part of combined stock; BALAR survey provides additional structure |
| 6 | West | Intense exploitation; strong truncation of larger sizes; ICATMAR LF show males truncating near 45 cm while females dominate larger sizes |
| 7 | West | Mixed fleets (ESP+FRA); gillnets/longlines historically important |
| 8–11 | West Adjacent | Lower female \(L_\infty\); sex-specific slicing in stock setup |
| 17–18 | Central | Lower \(L_\infty\); bi-phasic SS3 growth preferred |
| 19 | Central | Sex-structured SS3; moderate \(L_\infty\); multi-fleet fishery |
| 22–27 | East | Wide size range; less truncated length structure |
2 Assessment impacts–growth slicing sensitivity
The VBGF for expected length at age is
\[ L(a) = L_\infty \bigl(1 - e^{-k\,(a - t_0)}\bigr), \]
and observed lengths are assumed \[ L \mid a \sim \mathcal N\!\left(L(a),\, \sigma_L^2\right). \]
Four options are compared: - Base_WestMed: tagging-informed West Med curve (110 cm, \(k = 0.178\), \(t_0 \approx 0\)). - Lower_Linf: shrinks \(L_\infty\) (95 cm) and modestly increases \(k\) to mimic slower growth. - Higher_k: keeps \(L_\infty\) but steepens \(k\) to 0.25, shifting growth leftward. - Adriatic_SS3: uses a smaller Adriatic-style asymptote (88 cm) and earlier \(t_0\).
Code
vb_len <- function(age, Linf, k, t0) Linf * (1 - exp(-k * (age - t0)))
growth_scenarios <- tibble::tribble(
~scenario, ~Linf, ~k, ~t0,
"Base_WestMed", 110, 0.178, -0.005,
"Lower_Linf", 95, 0.20, -0.10,
"Higher_k", 110, 0.20, -0.20,
"Adriatic_SS3", 88, 0.20, -0.10
)
ages <- 0:6
len_sd <- 2
L_grid <- seq(5, 70, 0.5)
dL_given_a <- function(L, age, Linf, k, t0, sigma = len_sd) {
mu <- vb_len(age, Linf, k, t0)
dnorm(L, mu, sigma)
}2.0.1 P(a|L) heatmaps
Code
pal_grid <- growth_scenarios |>
tidyr::crossing(age = ages, L = L_grid) |>
rowwise() |>
mutate(density_L_given_a = dL_given_a(L, age, Linf, k, t0)) |>
ungroup() |>
group_by(scenario, L) |>
mutate(p_a_given_L = density_L_given_a / sum(density_L_given_a)) |>
ungroup()Here the conditional probability of age given length for a discrete set of ages is \[ P(a \mid L) = \frac{f(L \mid a)}{\sum_{a'} f(L \mid a')}, \] where \(f(L \mid a)\) is the normal density above. Changing \(L_\infty\), \(k\), or \(t_0\) shifts the relative weight across ages for the same observed length.
Code
ggplot(pal_grid, aes(L, factor(age), fill = p_a_given_L)) +
geom_tile() +
facet_wrap(~ scenario) +
scale_fill_viridis_c() +
labs(x="Length (cm)", y="Age", title="P(age|L)") +
theme_few()As \(L_\infty\) drops (e.g., Lower_Linf), younger ages gain probability mass at a given length, while a steeper \(k\) (Higher_k) compresses early growth and can shift modal ages downward for small fish but upward for larger fish. Figure 2 shows these shifts across scenarios.
Code
len_pdf <- pal_grid |>
group_by(scenario, age) |>
mutate(p_L_given_a = density_L_given_a / sum(density_L_given_a)) |>
ungroup()2.0.2 Slicing impacts shown as ΔCAA
Given an LFD with counts \(n_L\) at mid-point \(L\), slicing under a given growth option is \[ \mathrm{CAA}_a = \sum_L P(a \mid L)\, n_L, \] which collapses length bins into ages using the option-specific \(P(a \mid L)\).
Code
set.seed(123)
true_age_probs <- c(0.30,0.30,0.20,0.10,0.06,0.03,0.01)
names(true_age_probs) <- ages
n_fish <- 5000
true_ages <- sample(ages, n_fish, TRUE, true_age_probs)
true_growth <- growth_scenarios |> filter(scenario=="Base_WestMed")
sim_lengths <- tibble(age=true_ages) |>
mutate(L = vb_len(age, true_growth$Linf, true_growth$k, true_growth$t0) +
rnorm(n(), 0, len_sd))
sim_LFD <- sim_lengths |>
mutate(L_bin = cut(L, breaks=seq(0,80,1), right=FALSE)) |>
count(L_bin) |>
filter(!is.na(L_bin)) |>
mutate(L_mid = as.numeric(sub("\\[(.*),(.*)\\)","\\1",L_bin))+0.5)
CAA_by_scenario <- pal_grid |>
mutate(L_mid=L) |>
inner_join(sim_LFD, by="L_mid") |>
group_by(scenario, age) |>
summarise(CAA=sum(p_a_given_L*n), .groups="drop")
base_scen <- "Base_WestMed"
CAA_wide <- CAA_by_scenario |>
tidyr::pivot_wider(names_from=scenario, values_from=CAA)
CAA_long_diff <- CAA_wide |>
pivot_longer(-age, names_to="scenario", values_to="CAA") |>
group_by(age) |>
mutate(CAA_base = CAA[scenario==base_scen],
Delta_CAA = CAA - CAA_base) |>
ungroup() |>
filter(scenario != base_scen)
ggplot(CAA_long_diff, aes(factor(age), scenario, fill=Delta_CAA)) +
geom_tile() +
scale_fill_gradient2() +
labs(x="Age", y="Scenario", title="ΔCAA vs Base Growth") +
theme_few()Relative to the base curve, shrinking \(L_\infty\) moves fish from older to younger ages, increasing CAA for ages 0–2 and decreasing it for ages 4–6. Higher \(k\) behaves similarly but with a milder gradient; the Adriatic-style curve has the strongest shift toward young ages. See Figure 3.
2.0.3 Modal age flips
Modal ages flip earliest in the 20–30 cm range; larger \(L_\infty\) holds older modal ages at intermediate lengths, so reducing \(L_\infty\) pushes the mode younger for much of the length range. A “flip” means the age with highest \(P(a \mid L)\) differs between two growth assumptions; e.g., a 25 cm fish might be modal age 2 under the base curve but modal age 1 under the lower \(L_\infty\) curve. These flips are where slicing outcomes are most sensitive to the growth choice because a small change in assumed curve reassigns the modal age for an entire length bin. Figure 4 highlights the affected length bins.
Code
scen_A <- "Base_WestMed"
scen_B <- "Lower_Linf"
modal_age_by_L <- pal_grid |>
filter(scenario %in% c(scen_A, scen_B)) |>
group_by(scenario, L) |>
slice_max(p_a_given_L) |>
ungroup() |>
select(scenario, L, modal_age=age)
modal_wide <- modal_age_by_L |>
pivot_wider(names_from=scenario, values_from=modal_age,
names_prefix="age_") |>
mutate(flip = age_Base_WestMed != age_Lower_Linf)
ggplot(modal_wide, aes(L, 0, fill=flip)) +
geom_tile(height=1) +
scale_fill_manual(values=c("grey80","red")) +
labs(x="Length", y=NULL, title="Length bins where modal age flips") +
theme_few() +
theme(axis.text.y=element_blank(), axis.ticks.y=element_blank())2.0.4 Derived weight-at-age under each growth option
Assuming \(W = \alpha L^\beta\) with \(\alpha = 0.0000067\) and \(\beta = 3.035\), expected weight at age is the probability-weighted integral over length: \[ E[W \mid a] = \alpha \sum_L L^\beta \, P(L \mid a). \]
Code
alpha_w <- 0.0000067
beta_w <- 3.035
weight_age <- len_pdf |>
group_by(scenario, age) |>
summarise(E_W = alpha_w * sum((L^beta_w) * p_L_given_a),
E_L = sum(L * p_L_given_a),
.groups="drop")
ggplot(weight_age, aes(age, E_W, color=scenario)) +
geom_line() +
geom_point() +
labs(x="Age", y="Expected weight (kg)", title="Weight-at-age implied by growth options") +
theme_few()Lower \(L_\infty\) yields sharply lower weight for ages 3+, while higher \(k\) mainly affects ages 0–2. The Adriatic-style curve gives the lightest older fish because both \(L_\infty\) and \(t_0\) pull lengths downward. Figure 5 summarizes the expected weight trajectories.
2.0.5 Stochastic slicing
Code
set.seed(456)
n_draws <- 200
base <- growth_scenarios |> filter(scenario==base_scen)
draws <- tibble(
draw=1:n_draws,
Linf=rnorm(n_draws, base$Linf, 3),
k=rnorm(n_draws, base$k, 0.02),
t0=rnorm(n_draws, base$t0, 0.05)
)
slice_once <- function(Linf, k, t0) {
pal <- tibble(L=L_grid) |>
tidyr::crossing(age=ages) |>
rowwise() |>
mutate(density=dL_given_a(L, age, Linf, k, t0)) |>
ungroup() |>
group_by(L) |>
mutate(p=density/sum(density)) |>
ungroup() |>
mutate(L_mid=L)
pal |>
inner_join(sim_LFD, by="L_mid") |>
group_by(age) |>
summarise(CAA=sum(p*n), .groups="drop")
}
CAA_stoch <- draws |>
mutate(CAA = purrr::pmap(list(Linf,k,t0), slice_once)) |>
tidyr::unnest(CAA)
CAA_summary <- CAA_stoch |>
group_by(age) |>
summarise(CAA_mean=mean(CAA),
CAA_lo=quantile(CAA,0.05),
CAA_hi=quantile(CAA,0.95))This “stochastic slicing” draws \((L_\infty, k, t_0)\) from normal priors centered on the base West Med values. The resulting fan of \(\mathrm{CAA}_a\) reflects uncertainty in the growth curve rather than sampling error in the LFD.
Code
ggplot(CAA_summary, aes(factor(age), CAA_mean)) +
geom_col() +
geom_errorbar(aes(ymin=CAA_lo, ymax=CAA_hi), width=0.2) +
labs(x="Age", y="CAA", title="Stochastic slicing uncertainty") +
theme_few()Even modest growth uncertainty (SD of 3 cm in \(L_\infty\) and 0.02 in \(k\)) produces wide CAA intervals for ages 0–3. The width narrows for older ages because relatively few fish are assigned there under any plausible curve, so variance in \(P(a \mid L)\) matters less.
2.0.6 Weight-at-age under growth uncertainty
The same draws of \((L_\infty, k, t_0)\) propagate to weight-at-age via \(W = \alpha L^\beta\). For each draw, we compute \(E[W \mid a]\) using \(P(L \mid a)\) and then summarise the envelope.
Code
weight_once <- function(Linf, k, t0) {
tibble(L=L_grid) |>
tidyr::crossing(age=ages) |>
rowwise() |>
mutate(density=dL_given_a(L, age, Linf, k, t0)) |>
ungroup() |>
group_by(age) |>
mutate(p_L_given_a = density/sum(density)) |>
summarise(W = alpha_w * sum((L^beta_w) * p_L_given_a),
.groups="drop")
}
weight_stoch <- draws |>
mutate(weight = purrr::pmap(list(Linf,k,t0), weight_once)) |>
tidyr::unnest(weight)
weight_stoch_summary <- weight_stoch |>
group_by(age) |>
summarise(W_mean = mean(W),
W_lo = quantile(W, 0.05),
W_hi = quantile(W, 0.95),
.groups="drop")Code
ggplot(weight_stoch_summary, aes(age, W_mean)) +
geom_ribbon(aes(ymin=W_lo, ymax=W_hi), fill="grey80") +
geom_line() +
geom_point() +
labs(x="Age", y="Expected weight (kg)", title="Weight-at-age uncertainty from growth draws") +
theme_few()Growth uncertainty pushes a wide band of weight-at-age for ages 2–4, where small changes in \(L_\infty\) and \(k\) translate to large cubic changes in weight. Younger ages show less spread because all curves predict short lengths there, and older ages collapse as very few fish are assigned to them. Figure 7 shows the envelope.
2.0.7 Proportion mature-at-age
Assuming a length-based maturity ogive with 50% maturity at 29 cm and slope parameter \(s=0.5\), \[ p_{\text{mat}}(L) = \frac{1}{1 + e^{-s(L - 26)}}. \] The maturity-at-age for each growth option integrates this ogive over \(P(L \mid a)\).
Code
L50 <- 26
s_mat <- 0.5
pmat <- function(L) 1 / (1 + exp(-s_mat * (L - L50)))
mature_age <- len_pdf |>
mutate(p_mat = pmat(L)) |>
group_by(scenario, age) |>
summarise(p_mature = sum(p_mat * p_L_given_a), .groups="drop")
ggplot(mature_age, aes(age, p_mature, color=scenario)) +
geom_line() +
geom_point() +
labs(x="Age", y="Proportion mature", title="Proportion mature-at-age by growth option") +
theme_few()Lower \(L_\infty\) accelerates maturity (ages 1–2 cross 50% earlier), while the base and high-\(k\) curves keep a longer immature tail. This illustrates how growth choice not only shifts CAA but also maturity ogives used in spawning-stock calculations. See Figure 8.
2.1 Sensitivity: spawning biomass per recruit
Using the weight-at-age and maturity-at-age above, a natural mortality schedule \(M_a = (1.155, 0.61, 0.437, 0.353, 0.304, 0.273)\) for ages 0–5+, and a knife-edge fishery selectivity at 18 cm (probability of capture = 0 below 18 cm, 1 above), the spawning biomass per recruit (SSB/R) under a unit fully-selected fishing mortality (\(F=1\)) is \[ \text{SSB/R} = \sum_{a=0}^{A-1} N_a e^{-Z_a/2} W_a p_{\text{mat},a} + N_A \frac{e^{-Z_A/2}}{1 - e^{-Z_A}} W_A p_{\text{mat},A}, \] with \(Z_a = M_a + F s_a\), \(N_0 = 1\), and \(N_{a+1} = N_a e^{-Z_a}\) for \(a < A\).
Code
M_sched <- c(1.155, 0.61, 0.437, 0.353, 0.304, 0.273)
M_at_age <- c(M_sched[1:6], M_sched[6]) # extend 5+ to age 6
F_full <- 1
L_sel <- 18
sel_at_age <- len_pdf |>
mutate(sel = ifelse(L >= L_sel, 1, 0)) |>
group_by(scenario, age) |>
summarise(sel = sum(sel * p_L_given_a), .groups="drop")
ssbpr_calc <- function(scen) {
waa <- weight_age |> filter(scenario == scen) |> arrange(age) |> pull(E_W)
mat <- mature_age |> filter(scenario == scen) |> arrange(age) |> pull(p_mature)
sel <- sel_at_age |> filter(scenario == scen) |> arrange(age) |> pull(sel)
Z <- M_at_age + F_full * sel
A <- length(ages) - 1
N <- numeric(length(ages))
N[1] <- 1
for (i in 1:A) {
N[i+1] <- N[i] * exp(-Z[i])
}
ssb <- sum(N[1:A] * exp(-0.5 * Z[1:A]) * waa[1:A] * mat[1:A])
ssb_plus <- N[A+1] * exp(-0.5 * Z[A+1]) / (1 - exp(-Z[A+1])) * waa[A+1] * mat[A+1]
tibble(scenario = scen, SSBPR = ssb + ssb_plus)
}
ssbpr <- growth_scenarios$scenario |>
purrr::map_dfr(ssbpr_calc) |>
mutate(rel_base = SSBPR / SSBPR[scenario == "Adriatic_SS3"])Code
ggplot(ssbpr, aes(scenario, SSBPR, fill=scenario)) +
geom_col() +
labs(x=NULL, y="SSB per recruit (kg)", title="SSB/R sensitivity to growth assumptions") +
theme_few() +
theme(legend.position="none")The Adriatic-style growth produces the lowest SSB/R because smaller \(L_\infty\) depresses weight and maturity at age, while the tagging-based Base_WestMed yields the highest SSB/R under the same \(M\) and \(F\) assumptions. Higher_k sits slightly below the base because earlier growth moves fish into the fishery (18 cm) sooner, increasing fishing mortality exposure before they reach larger weights.
Unfished SSB/R differs by more than two-fold between the tagging-informed base and the Adriatic-style curve, underscoring how growth choice alone changes the starting point for spawning potential calculations.
2.1.1 Impact on proxy \(F_{MSY}\) estimates
Varying \(F\) from 0 to 1.0 (fully selected) shows how the growth options change the slope of the SSB/R curve. Steeper curves imply greater sensitivity to fishing.
Code
F_grid <- seq(0, 1.0, by=0.025)
ssbpr_one <- function(scen, Fval) {
waa <- weight_age |> filter(scenario == scen) |> arrange(age) |> pull(E_W)
mat <- mature_age |> filter(scenario == scen) |> arrange(age) |> pull(p_mature)
sel <- sel_at_age |> filter(scenario == scen) |> arrange(age) |> pull(sel)
Z <- M_at_age + Fval * sel
A <- length(ages) - 1
N <- numeric(length(ages))
N[1] <- 1
for (i in 1:A) {
N[i+1] <- N[i] * exp(-Z[i])
}
ssb <- sum(N[1:A] * exp(-0.5 * Z[1:A]) * waa[1:A] * mat[1:A])
ssb_plus <- N[A+1] * exp(-0.5 * Z[A+1]) / (1 - exp(-Z[A+1])) * waa[A+1] * mat[A+1]
ssb + ssb_plus
}
ssbpr_F <- expand_grid(scenario = growth_scenarios$scenario, F = F_grid) |>
mutate(SSBPR = purrr::map2_dbl(scenario, F, ssbpr_one))
ssbpr_ref0 <- ssbpr_F |>
filter(scenario == "Adriatic_SS3", F == 0) |>
pull(SSBPR)
ssbpr_F <- ssbpr_F |>
mutate(SSBPR_pct = 100*SSBPR / ssbpr_ref0)Code
ggplot(ssbpr_F, aes(F, SSBPR_pct, color=scenario)) +
geom_line() +
labs(x="Fully selected F", y="SSB/R (% of Adriatic_SS3 F=0)", title="SSB/R profile by growth option") + geom_hline(yintercept = 35,
color='grey', linetype = "dashed") +
scale_y_continuous(labels = scales::label_percent(scale = 1)) +
theme_few()All curves decline as \(F\) rises, but the lower-\(L_\infty\) and higher-\(k\) options drop faster because fish become vulnerable earlier and contribute less weight before capture. Expressing everything relative to the Adriatic_SS3 unfished level highlights how quickly each option erodes spawning potential compared to that conservative baseline (the horizontal dashed line is \(F_{35\%}\) where it crosses the other lines). Figure 9 and Figure 10 show the level and shape of SSB/R across \(F\).
2.2 SS3-style growth parameters and length-at-age densities
Stock Synthesis 3 often parameterizes the VBGF with two fixed size-at-age points and \(k\): \[ L_t = L_\infty + (L_1 - L_\infty)e^{-k\,(t-A_1)}, \] where \(L_1\) is length at age \(A_1\) (young age) and \(L_\infty\) is derived from a second size-at-age point \((A_2, L_2)\) as \[ L_\infty = L_1 + \frac{L_2 - L_1}{1 - e^{-k\,(A_2 - A_1)}}. \] Here we take \(A_1 = 0.5\) yr and \(A_2 = 4.5\) yr. The current VBGF parameters map to these SS3 inputs as follows:
Code
ss3_params <- growth_scenarios |>
mutate(L1 = vb_len(0.5, Linf, k, t0),
L2 = vb_len(4.5, Linf, k, t0),
k_param = k) |>
select(scenario, L1, L2, k_param)
knitr::kable(ss3_params, digits=2,
col.names=c("Scenario","L1 (0.5 yr, cm)","L2 (4.5 yr, cm)","k"))| Scenario | L1 (0.5 yr, cm) | L2 (4.5 yr, cm) | k |
|---|---|---|---|
| Base_WestMed | 9.46 | 60.67 | 0.18 |
| Lower_Linf | 10.74 | 57.14 | 0.20 |
| Higher_k | 14.37 | 67.03 | 0.20 |
| Adriatic_SS3 | 9.95 | 52.93 | 0.20 |
To illustrate implied observation error on length-at-age under this parameterization, assume the CV of length-at-age is 0.1 at age \(A_1\) and 0.09 at age \(A_2\), linearly interpolated between and held flat beyond. Length-at-age is then modeled as \(L \mid a \sim \mathcal N(\mu_a, (\text{CV}_a \mu_a)^2)\) with \(\mu_a\) from the mapped growth curve.
Code
cv_points <- tibble(age=c(0.5, 4.5), cv=c(0.10, 0.09))
cv_fun <- function(age_vec) approx(cv_points$age, cv_points$cv, xout=age_vec, rule=2)$y
ages_ss3 <- seq(0.5, 5, by=0.5)
L_grid_ss3 <- seq(5, 120, by=0.5)
dens_ss3 <- growth_scenarios |>
tidyr::crossing(age = ages_ss3, L = L_grid_ss3) |>
rowwise() |>
mutate(mu = vb_len(age, Linf, k, t0),
cv = cv_fun(age),
sigma = cv * mu,
density = dnorm(L, mu, sigma)) |>
ungroup()
# Build mirrored polygons by age to mimic SS3 growth plot; shown for Base_WestMed
poly_all <- dens_ss3 |>
filter(age %in% c(1, 2, 3, 4)) |>
filter(scenario %in% c("Base_WestMed","Adriatic_SS3")) |>
filter(density >= 1e-5) |>
group_by(scenario, age) |>
arrange(L) |>
mutate(width = density / max(density) * 0.15) |>
summarise(
age_poly = c(age - width, rev(age + width)),
L_poly = c(L, rev(L)),
scenario = first(scenario),
grp = cur_group_id(),
.groups="drop"
)Code
ggplot(poly_all, aes(age_poly, L_poly, group=interaction(grp, scenario), fill=scenario, color=scenario)) +
geom_polygon(alpha=0.5) +
labs(x="Age (yr)", y="Length (cm)", title="Length-at-age densities (ages 1–4) by scenario") +
coord_cartesian(ylim=c(min(L_grid_ss3), 80)) +
theme_minimal()These densities form the matrix \(f(L \mid a)\) that SS3 would use given the mapped parameters and CV schedule. The tagging-informed curve pushes means rightward, while the Adriatic-style curve keeps most probability mass at shorter lengths; decreasing CV with age tightens older-age densities around their means.
2.3 Age versus length-based maturity
If we imagine that spawning is directly linked to fish age rather than size, we can illustrate how a fixed (over time) age-based ogive can cause variability in the apparent L50. To examine this we simulated a 20-year population using the Base_WestMed growth and the same natural mortality schedule. Recruitment at age 0 follows a lognormal distribution with \(\sigma_R = 0.8\) (mean 1e6). For this example, survival is driven only by \(M\) (no fishing). Maturity is 50% at age 2 (age-50) with a steep logistic slope; we then infer the implied length at 50% maturity each year using the Base_WestMed \(P(L \mid a)\) matrix.
Code
set.seed(999)
years <- 1:20
ages_pop <- ages
sigma_R <- 0.8
mean_rec <- 1e6
rec <- rlnorm(length(years), log(mean_rec) - 0.5 * sigma_R^2, sigma_R)
M_vec <- M_at_age[1:length(ages_pop)]
N_mat <- matrix(0, nrow=length(years), ncol=length(ages_pop))
colnames(N_mat) <- paste0("age_", ages_pop)
N_mat[,1] <- rec
for (y in 2:length(years)) {
for (a in 1:(length(ages_pop)-2)) {
N_mat[y, a+1] <- N_mat[y-1, a] * exp(-M_vec[a])
}
pg <- length(ages_pop)
N_mat[y, pg] <- N_mat[y-1, pg-1] * exp(-M_vec[pg-1])
#+ N_mat[y-1, pg] * exp(-M_vec[pg])
}
N_long <- as_tibble(N_mat) |>
mutate(year = years) |>
pivot_longer(starts_with("age_"), names_to="age_lab", values_to="N") |>
mutate(age = as.numeric(sub("age_","", age_lab))) |>
select(year, age, N)
# Base_WestMed length-at-age matrix with extended length grid
L_grid_long <- seq(5, 120, by=0.5)
base_growth <- growth_scenarios |> filter(scenario == "Base_WestMed")
len_probs_base <- tidyr::crossing(age = ages_pop, L = L_grid_long) |>
rowwise() |>
mutate(density = dL_given_a(L, age, base_growth$Linf, base_growth$k, base_growth$t0)) |>
ungroup() |>
group_by(age) |>
mutate(p_L_given_a = density / sum(density)) |>
ungroup()
# Age-based maturity: 50% at age 2 with steep slope
p_mat_age <- tibble(age = ages_pop,
p_mat = plogis((age - 2) * 5))Code
ggplot(p_mat_age, aes(age, p_mat)) +
geom_line(color="firebrick") +
geom_point(color="firebrick") +
scale_y_continuous(labels=scales::label_percent(accuracy=1)) +
labs(x="Age (yr)", y="Maturity", title="Age-based maturity ogive (A50 = 2 yr)") +
theme_few()Code
ggplot(tibble(year=years, recruitment=rec), aes(year, recruitment)) +
geom_col(fill="grey60") +
labs(x="Year", y="Recruitment (age 0)", title="Simulated recruitment (lognormal, sigma=0.8)") +
theme_few()
calc_L50 <- function(y) {
N_y <- N_long |> filter(year == y) |> select(age, N)
df <- len_probs_base |>
left_join(N_y, by="age") |>
left_join(p_mat_age, by="age")
total_L <- df |>
group_by(L) |>
summarise(total = sum(N * p_L_given_a, na.rm=TRUE),
mature = sum(N * p_L_given_a * p_mat, na.rm=TRUE),
.groups="drop")
if (all(total_L$total == 0)) return(NA_real_)
prop <- total_L$mature / total_L$total
idx <- which(prop >= 0.5)[1]
if (is.na(idx)) return(NA_real_)
if (idx == 1) return(total_L$L[1])
L1 <- total_L$L[idx-1]; L2 <- total_L$L[idx]
p1 <- prop[idx-1]; p2 <- prop[idx]
L1 + (0.5 - p1) * (L2 - L1) / (p2 - p1)
}
L50_ts <- tibble(year = years,
L50_length = purrr::map_dbl(years, calc_L50)) |> filter(year>5)Code
ggplot(L50_ts, aes(year, L50_length)) +
geom_line(color="steelblue") +
geom_point(color="steelblue") +
labs(x="Year", y="Implied L50 (cm)", title="Length at 50% maturity implied by age-50=2") +
theme_few()Even with a fixed age-50 at 2 years, the implied length-50 varies across years because recruitment swings change the age structure that feeds into the \(P(L \mid a)\) matrix. High recruitment years (more young fish) pull the realized L50 downward; cohorts aging through the population push it upward.
Repeating the same calculation for the Adriatic_SS3 growth option:
Code
adr_growth <- growth_scenarios |> filter(scenario == "Adriatic_SS3")
len_probs_adr <- tidyr::crossing(age = ages_pop, L = L_grid_long) |>
rowwise() |>
mutate(density = dL_given_a(L, age, adr_growth$Linf, adr_growth$k, adr_growth$t0)) |>
ungroup() |>
group_by(age) |>
mutate(p_L_given_a = density / sum(density)) |>
ungroup()
calc_L50_scen <- function(probs_mat) {
function(y) {
N_y <- N_long |> filter(year == y) |> select(age, N)
df <- probs_mat |>
left_join(N_y, by="age") |>
left_join(p_mat_age, by="age")
total_L <- df |>
group_by(L) |>
summarise(total = sum(N * p_L_given_a, na.rm=TRUE),
mature = sum(N * p_L_given_a * p_mat, na.rm=TRUE),
.groups="drop")
if (all(total_L$total == 0)) return(NA_real_)
prop <- total_L$mature / total_L$total
idx <- which(prop >= 0.5)[1]
if (is.na(idx)) return(NA_real_)
if (idx == 1) return(total_L$L[1])
L1 <- total_L$L[idx-1]; L2 <- total_L$L[idx]
p1 <- prop[idx-1]; p2 <- prop[idx]
L1 + (0.5 - p1) * (L2 - L1) / (p2 - p1)
}
}
L50_base <- tibble(
scenario="Base_WestMed",
year = years,
L50_length = purrr::map_dbl(years, calc_L50_scen(len_probs_base))
) |> filter(year>5)
L50_adr <- tibble(
scenario="Adriatic_SS3",
year = years,
L50_length = purrr::map_dbl(years, calc_L50_scen(len_probs_adr))
) |> filter(year>5)
L50_both <- bind_rows(L50_base, L50_adr)Code
ggplot(L50_both, aes(year, L50_length, color=scenario)) +
geom_line() +
geom_point() +
labs(x="Year", y="Implied L50 (cm)", title="Length at 50% maturity (age-50=2) by growth option") +
theme_few()The Adriatic_SS3 option yields consistently lower implied length-50 because its growth curve keeps lengths smaller at a given age. Interannual variability remains driven by recruitment swings, but the baseline level is pulled down relative to the West Med curve.
3 MEDITs exploratory plots (HKE)
This section folds in the earlier MEDITs exploratory notebook so survey diagnostics and growth slicing live together under one table of contents. The scripts were drafted from informal queries; please review before reuse.
3.0.1 Data setup
Code
ta <- qread("TA_HKE.qs") |> filter(area != 2)
tb <- qread("TB_HKE.qs") |> filter(area != 2, genus=="MERL")
tc <- qread("TC_HKE.qs") |> filter(area != 2, genus=="MERL")3.1 How the tables relate
ta: haul-level metadata (area, vessel, gear, positions, depth, duration, validity).tb: haul-level species totals (genus/species codes, numbers by sex categories).tc: length-frequency for species within hauls (length classes, counts).
Common keys to link: country, area, vessel, year, month, day, haul_number, codend_closing, partit.
3.2 Merluccius extraction
Filter for hake (genus == "MERL" and species == "MEL") and join lengths to hauls to get year and haul context.
Code
# Standard key for joins
join_keys <- c("country","area","vessel","year","month","day","haul_number","codend_closing","partit")
# Merluccius in species totals (tb)
hake_tb <- tb |>
filter(genus == "MERL", species %in% c("MEL", "MER"))
# Merluccius in length data (tc)
hake_tc <- tc |>
filter(genus == "MERL", species %in% c("MEL", "MER"))
# helper: convert ddmm.mm format to decimal degrees
dms_to_dd <- function(x) {
deg <- floor(x / 100)
min <- x - deg * 100
deg + min / 60
}
# Join lengths to haul-year info via ta (carry distance and wing_opening for swept area)
hake_len <- hake_tc |>
inner_join(
ta |> select(any_of(c(join_keys, "distance", "wing_opening",
"shooting_latitude", "shooting_longitude", "shooting_quadrant"))),
by = join_keys
)
glimpse(hake_len)Rows: 84,171
Columns: 31
$ ...1 <dbl> 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 366, 367, 3…
$ id <dbl> 42855146, 42855147, 42855148, 42855149, 42855150, 4…
$ country <chr> "FRA", "FRA", "FRA", "FRA", "FRA", "FRA", "FRA", "F…
$ area <dbl> 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, …
$ vessel <chr> "LEU", "LEU", "LEU", "LEU", "LEU", "LEU", "LEU", "L…
$ year <dbl> 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 201…
$ haul_number <dbl> 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 75, 75, 75,…
$ codend_closing <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "…
$ partit <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "…
$ genus <chr> "MERL", "MERL", "MERL", "MERL", "MERL", "MERL", "ME…
$ species <chr> "MER", "MER", "MER", "MER", "MER", "MER", "MER", "M…
$ codlon <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "…
$ pfrac <dbl> 3085, 3085, 3085, 3085, 3085, 3085, 3085, 3085, 308…
$ pechan <dbl> 470, 470, 470, 470, 470, 470, 470, 470, 470, 470, 6…
$ sex <chr> "I", "I", "I", "I", "I", "I", "I", "I", "I", "I", "…
$ nbsex <dbl> 72, 72, 72, 72, 72, 72, 72, 72, 72, 72, 2, 2, 60, 6…
$ length_class <dbl> 75, 80, 85, 90, 95, 100, 105, 110, 115, 125, 355, 3…
$ maturity <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "…
$ nblon <dbl> 1, 2, 4, 14, 13, 14, 12, 6, 5, 1, 1, 1, 2, 7, 7, 12…
$ matsub <chr> "ND", "ND", "ND", "ND", "ND", "ND", "ND", "ND", "ND…
$ tf <chr> "TC", "TC", "TC", "TC", "TC", "TC", "TC", "TC", "TC…
$ month <dbl> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, …
$ day <dbl> 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23,…
$ catfau <chr> "Ao", "Ao", "Ao", "Ao", "Ao", "Ao", "Ao", "Ao", "Ao…
$ upload_id <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ name_of_survey <chr> "MEDITS", "MEDITS", "MEDITS", "MEDITS", "MEDITS", "…
$ distance <dbl> 2833, 2833, 2833, 2833, 2833, 2833, 2833, 2833, 283…
$ wing_opening <dbl> 195, 195, 195, 195, 195, 195, 195, 195, 195, 195, 2…
$ shooting_latitude <dbl> 4258.23, 4258.23, 4258.23, 4258.23, 4258.23, 4258.2…
$ shooting_longitude <dbl> 426.13, 426.13, 426.13, 426.13, 426.13, 426.13, 426…
$ shooting_quadrant <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
3.3 Approximate densities (numbers per km²)
Compute station-level swept area as distance * wing_opening (m², converted to km²) and total hake numbers per station. Plot the distribution of numbers per km² by year and area.
Code
station_density <- hake_len |>
group_by(country, area, vessel, year, month, day, haul_number, codend_closing, partit,
distance, wing_opening, shooting_latitude, shooting_longitude, shooting_quadrant) |>
summarise(n_total = sum(nblon, na.rm = TRUE), .groups = "drop") |>
mutate(
swept_area_km2 = (distance * wing_opening) / 1e6,
density_n_km2 = n_total / swept_area_km2
)
station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(
area = factor(area),
period = paste0(floor((year - min(year, na.rm = TRUE)) / 5) * 5 +
min(year, na.rm = TRUE), "–",
floor((year - min(year, na.rm = TRUE)) / 5) * 5 +
min(year, na.rm = TRUE) + 4)
) |>
ggplot(aes(density_n_km2)) +
geom_histogram(fill = "steelblue", alpha = 0.7, boundary = 0) +
scale_x_log10() +
facet_grid(period ~ area, scales = "free_y") +
labs(
x = "Estimated density (numbers / km^2, log scale)",
y = "Count of stations",
title = "Station-level Merluccius density proxy by 5-year period and area"
) +
ggthemes::theme_few() +
theme(panel.border = element_blank())The faceted density plot caps densities at 500 fish/km² and shows haul-level distributions by year (rows) and area (columns); capping plus binning keeps long tails readable.
Code
station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(area = factor(area),
dens_cap = pmin(density_n_km2, 500)) |>
ggplot(aes(dens_cap)) +
geom_histogram(binwidth = 25, fill = "darkorange", alpha = 0.7, boundary = 0) +
facet_wrap(~ area, scales = "free_y") +
labs(
x = "Estimated density (numbers / km^2)",
y = "Count of stations",
title = "Station-level Merluccius density proxy by area (all years)"
) +
ggthemes::theme_few()This collapses all years, still capping at 500, to compare overall density shapes by area.
Code
station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(year = factor(year), area = factor(area)) |>
ggplot(aes(year, density_n_km2, fill = area, color = area)) +
geom_boxplot(outlier.alpha = 0.4) +
scale_y_log10(limits = c(NA, 750)) +
labs(
x = "Year",
y = "Estimated density (numbers / km^2)",
title = "Station-level density boxplots by year and area",
fill = "Area",
color = "Area"
) +
ggthemes::theme_few()Boxplots summarise the log-scaled station densities by year and area, highlighting medians and spread.
Code
station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
group_by(area, year) |>
summarise(med_density = median(density_n_km2), .groups = "drop") |>
mutate(year = as.numeric(year)) |>
ggplot(aes(year, med_density, color = factor(area))) +
geom_point() +
geom_smooth(se = TRUE, method = "loess", span = 0.6, alpha=.05) +
scale_y_log10(limits = c(NA, 750)) +
labs(
x = "Year",
y = "Median density (numbers / km^2, log scale)",
title = "Median station-level density over time by area",
color = "Area"
) +
ggthemes::theme_few()Densities also vary spatially. The next heatmaps show where hauls were densest, first pooling all years, then grouping by decade to highlight shifts in hotspots.
Code
if (!requireNamespace("maps", quietly = TRUE)) {
stop("Package 'maps' is required for coastline/land layer")
}
dens_map <- station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(
lat_dd = dms_to_dd(shooting_latitude),
lon_dd = dms_to_dd(shooting_longitude) *
ifelse(shooting_quadrant %in% c(7, "7"), -1, 1),
log_density = log10(density_n_km2)
) |>
filter(is.finite(lat_dd), is.finite(lon_dd), is.finite(log_density))
world <- ggplot2::map_data("world") |>
filter(
long > min(dens_map$lon_dd, na.rm = TRUE) - 1,
long < max(dens_map$lon_dd, na.rm = TRUE) + 1,
lat > min(dens_map$lat_dd, na.rm = TRUE) - 1,
lat < max(dens_map$lat_dd, na.rm = TRUE) + 1
)
ggplot() +
geom_polygon(data = world, aes(long, lat, group = group),
fill = "grey90", color = "grey70", linewidth = 0.2) +
stat_summary_2d(data = dens_map, aes(lon_dd, lat_dd, z = log_density),
fun = mean, bins = 35) +
scale_fill_viridis_c(name = "Mean log10 density\n(numbers / km^2)") +
coord_quickmap() +
labs(
x = "Longitude",
y = "Latitude",
title = "Spatial heatmap of log-density (all years combined)"
) +
ggthemes::theme_few()Code
dens_map_decade <- station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(
lat_dd = dms_to_dd(shooting_latitude),
lon_dd = dms_to_dd(shooting_longitude) *
ifelse(shooting_quadrant %in% c(7, "7"), -1, 1),
log_density = log10(density_n_km2),
decade = paste0((year %/% 10) * 10, "s")
) |>
filter(is.finite(lat_dd), is.finite(lon_dd), is.finite(log_density))
ggplot() +
geom_polygon(data = world, aes(long, lat, group = group),
fill = "grey90", color = "grey70", linewidth = 0.2) +
stat_summary_2d(data = dens_map_decade,
aes(lon_dd, lat_dd, z = log_density),
fun = mean, bins = 35) +
scale_fill_viridis_c(name = "Mean log10 density\n(numbers / km^2)") +
coord_quickmap() +
facet_wrap(~ decade) +
labs(
x = "Longitude",
y = "Latitude",
title = "Spatial heatmap of log-density by decade"
) +
ggthemes::theme_few()Parallel plots below translate density weighting into expected length: darker cells indicate areas where hauls contained longer fish on average.
Code
length_map <- hake_len |>
left_join(
station_density |>
select(all_of(join_keys), density_n_km2),
by = join_keys
) |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(
length_cm = length_class / 10,
lon_dd = dms_to_dd(shooting_longitude) *
ifelse(shooting_quadrant %in% c(7, "7"), -1, 1),
lat_dd = dms_to_dd(shooting_latitude)
) |>
group_by(country, area, vessel, year, month, day, haul_number, codend_closing, partit,
lon_dd, lat_dd) |>
summarise(
mean_length_cm = sum(length_cm * nblon * density_n_km2, na.rm = TRUE) /
sum(nblon * density_n_km2, na.rm = TRUE),
.groups = "drop"
) |>
filter(is.finite(mean_length_cm), is.finite(lon_dd), is.finite(lat_dd))
ggplot() +
geom_polygon(data = world, aes(long, lat, group = group),
fill = "grey90", color = "grey70", linewidth = 0.2) +
stat_summary_2d(data = length_map,
aes(lon_dd, lat_dd, z = mean_length_cm),
fun = mean, bins = 35) +
scale_fill_viridis_c(name = "Mean density-weighted length (cm)") +
coord_quickmap() +
labs(
x = "Longitude",
y = "Latitude",
title = "Spatial heatmap of density-weighted mean length (all years combined)"
) +
ggthemes::theme_few()Code
length_map_decade <- length_map |>
filter(mean_length_cm <= 45) |>
mutate(decade = paste0((year %/% 10) * 10, "s"))
ggplot() +
geom_polygon(data = world, aes(long, lat, group = group),
fill = "grey90", color = "grey70", linewidth = 0.2) +
stat_summary_2d(data = length_map_decade,
aes(lon_dd, lat_dd, z = mean_length_cm),
fun = mean, bins = 35) +
scale_fill_viridis_c(name = "Mean density-weighted length (cm)") +
coord_quickmap() +
facet_wrap(~ decade) +
labs(
x = "Longitude",
y = "Latitude",
title = "Spatial heatmap of density-weighted mean length by decade"
) +
ggthemes::theme_few()3.4 Length-class histograms by year
Length distributions help flag shifts toward smaller or larger fish over time. The ridge plot below overlays yearly curves (weighted by counts) to show whether lengths compress or stretch.
Code
hake_len |> filter(length_class<610) |>
mutate(
year = factor(year),
length_class_collapsed = pmin(length_class, 400)
) |>
ggplot(aes(x = length_class_collapsed/10, y = year, weight = nblon, fill = year)) +
geom_density_ridges(scale = 2.0, alpha = 0.7, color = "grey30") +
scale_fill_viridis_d(option = "C") +
labs(
x = "Length class",
y = "Year",
title = "Merluccius length-class distributions by year and area "
) +
theme_minimal() + facet_wrap(~area) +
theme(legend.position = "none")Quarterly density-weighted histograms expose seasonal structure in lengths within each area.
Code
hake_len |>
left_join(station_density |> select(all_of(join_keys), density_n_km2), by = join_keys) |>
filter(is.finite(density_n_km2)) |>
mutate(
quarter = paste0("Q", ceiling(month / 3)),
area = factor(area)
) |>
group_by(area, quarter, length_class) |>
summarise(weighted_n = sum(nblon * density_n_km2, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(length_class, weighted_n)) +
geom_col(fill = "darkolivegreen3", alpha = 0.8) +
facet_grid(area ~ quarter, scales = "free_y") +
labs(
x = "Length class",
y = "Density-weighted count",
title = "Density-weighted length frequencies by area and quarter (all years)"
) +
ggthemes::theme_few()The map below tracks how density-weighted mean haul locations move through time, giving a quick visual on survey footprint.
Code
convert_ddmm <- function(x) {
deg <- floor(x / 100)
min <- x - deg * 100
deg + min / 60
}
loc_weighted <- station_density |>
filter(is.finite(density_n_km2), density_n_km2 > 0) |>
mutate(
lat_dd = convert_ddmm(shooting_latitude),
lon_dd = convert_ddmm(shooting_longitude),
lon_dd = ifelse(shooting_quadrant == 7, -abs(lon_dd), abs(lon_dd))
) |>
group_by(area, year) |>
summarise(
wlat = weighted.mean(lat_dd, density_n_km2, na.rm = TRUE),
wlon = weighted.mean(lon_dd, density_n_km2, na.rm = TRUE),
.groups = "drop"
)
world <- ggplot2::map_data("world")
ggplot() +
geom_polygon(data = world, aes(long, lat, group = group), fill = "grey90", color = "white") +
geom_text(data = loc_weighted, aes(wlon, wlat, label = substr(year, 3, 4), color = factor(area)), nudge_y = 0.0, size = 7) +
geom_path(data = loc_weighted, aes(wlon, wlat, label = year, color = factor(area)), nudge_y = 0.0, size = .2) +
coord_quickmap(xlim = range(loc_weighted$wlon, na.rm = TRUE) + c(-1, 1),
ylim = range(loc_weighted$wlat, na.rm = TRUE) + c(-1, 1)) +
labs(
x = "Longitude",
y = "Latitude",
color = "Area",
title = "Density-weighted mean haul locations by year and area"
) +
ggthemes::theme_few(base_size=14)3.5 Notes on data structure
tbprovides totals by haul;tcprovides the detailed length composition. The joins above focus ontcto visualize distributions.- If multiple vessels/areas are present, further filters (e.g., by
areaorvessel) can be added before plotting.
3.6 Maturity data
Please note this code needs checking.
Code
maturity_len <- hake_len |>
filter(sex == "F") |>
mutate(status = ifelse(maturity != 1, "Mature", "Immature")) |>
filter(!is.na(status))
maturity_len |>
mutate(area = factor(area)) |>
ggplot(aes(length_class, nblon, fill = status)) +
geom_col(position = "identity", alpha = 0.7) +
facet_grid(area ~ ., scales = "free") +
labs(
x = "Length class",
y = "Count",
fill = "Status",
title = "Female Merluccius length frequencies by maturity status and area"
) +
ggthemes::theme_few()Below we track sample sizes and sex composition to gauge representativeness for length–weight work.
Code
hake_len |>
group_by(area, year) |>
summarise(n_length_records = sum(nblon, na.rm = TRUE), .groups = "drop") |>
mutate(area = factor(area)) |>
ggplot(aes(year, n_length_records, color = area, group = area)) +
geom_line() +
geom_point(size = 2) +
expand_limits(y = 0) +
labs(
x = "Year",
y = "Number of length records (nblon)",
color = "Area",
title = "Count of length measurements by year and area (proxy for length–weight data availability)"
) +
ggthemes::theme_few()Code
hake_len |>
filter(!is.na(sex)) |>
mutate(area = factor(area), sex = factor(sex)) |>
group_by(area, sex) |>
summarise(n_length_records = sum(nblon, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(area, n_length_records, fill = sex)) +
geom_col(position = "fill") +
scale_y_continuous(labels = scales::percent_format()) +
labs(
x = "Area",
y = "Sex ratio (share of length records)",
fill = "Sex",
title = "Sex ratio of length records by area (all years)"
) +
ggthemes::theme_few()Code
hake_len |>
filter(!is.na(sex)) |>
mutate(area = factor(area), sex = factor(sex)) |>
group_by(area, sex) |>
summarise(n_length_records = sum(nblon, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(area, n_length_records, fill = sex)) +
geom_col(position = "stack") +
labs(
x = "Area",
y = "Number of length records",
fill = "Sex",
title = "Counts of length records by sex and area (all years)"
) +
ggthemes::theme_few()Code
hake_len |>
filter(!is.na(sex)) |>
mutate(area = factor(area), sex = factor(sex)) |>
group_by(area, year, sex) |>
summarise(n_length_records = sum(nblon, na.rm = TRUE), .groups = "drop") |>
group_by(area, year) |>
mutate(total = sum(n_length_records)) |>
ungroup() |>
ggplot(aes(year, n_length_records / total, fill = sex)) +
geom_col(position = "stack") +
facet_wrap(~ area, nrow = 1) +
scale_y_continuous(labels = scales::percent_format(), limits = c(0, 1)) +
labs(
x = "Year",
y = "Sex ratio (share of length records)",
fill = "Sex",
title = "Sex ratio of length records by year and area"
) +
ggthemes::theme_few()Code
hake_len |>
filter(!is.na(sex)) |>
mutate(area = factor(area), sex = factor(sex)) |>
group_by(area, year, sex) |>
summarise(n_length_records = sum(nblon, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(year, n_length_records, fill = sex)) +
geom_col(position = "stack") +
facet_grid( area~., scales = "free_y") +
labs(
x = "Year",
y = "Number of length records",
fill = "Sex",
title = "Counts of length records by sex, year, and area"
) +
ggthemes::theme_few()Weight-at-length is summarized directly from available weights; if raw weights are absent, values fall back to the length–weight relationship to keep the plot populated.
Code
hake_len |>
mutate(
area = factor(area),
weight_kg = if ("weight" %in% names(hake_len)) {
weight
} else {
6.7e-6 * (length_class / 10) ^ 3.035
}
) |>
group_by(area, length_class) |>
summarise(mean_weight = mean(weight_kg, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(length_class, mean_weight, color = area)) +
geom_line() +
labs(
x = "Length class",
y = "Mean weight (kg)",
color = "Area",
title = "Weight-at-length across all years by area (raw where available, otherwise derived)"
) +
ggthemes::theme_few()3.7 Survey activity by area, month, and year
Code
ta |>
mutate(month = factor(month, levels = 1:12),
year = factor(year, levels = sort(unique(year), decreasing = FALSE)),
area = factor(area)) |>
distinct(area, year, month, haul_number) |>
ggplot(aes(month, year, fill = area)) +
geom_tile(color = "white") +
scale_y_discrete(limits = rev) +
labs(
x = "Month",
y = "Year (most recent at bottom)",
fill = "Area",
title = "MEDITs ESP survey activity by area, month, and year"
) +
theme_minimal()Code
ta |>
mutate(month = factor(month, levels = 1:12),
area = factor(area)) |>
count(area, month, name = "n_hauls") |>
ggplot(aes(month, n_hauls, fill = area)) +
geom_col(position = "dodge") +
labs(
x = "Month",
y = "Number of hauls (all years)",
fill = "Area",
title = "Distribution of tows by month and area (all years)"
) +
theme_minimal()Code
ta |>
mutate(month = factor(month, levels = 1:12),
area = factor(area),
year = factor(year)) |>
count(year, area, month, name = "n_hauls") |>
ggplot(aes(month, n_hauls, fill = area)) +
geom_col(position = "dodge") +
facet_wrap(~ year ) +
labs(
x = "Month",
y = "Number of hauls",
fill = "Area",
title = "Tows by month and area, faceted by year"
) +
theme_minimal()Code
# Time series of number of fish measured (length records) by year
hake_len |>
group_by(area, year) |>
summarise(total_measured = sum(nblon, na.rm = TRUE), .groups = "drop") |>
ggplot(aes(year, total_measured, color = factor(area), group = area)) +
geom_line() +
geom_point() +
expand_limits(y = 0) +
labs(
x = "Year",
y = "Number of fish measured (Merluccius)",
title = "Time series of measured fish by year and area",
color = "Area"
) +
theme_minimal()4 Consideration of other CPUE indices
To complement the survey indices, we can explore fishery-dependent signals using trawl effort and catch as an approximate nominal CPUE series shared in ITCAM GSA6 presentations. The following illustrative plots normalize effort and catch, then compare catch per horsepower and catch per TRB as simple proxies. These series should be treated cautiously—they are not standardized and reflect indicative trends only.
Code
dat <- qread("catch_effort.qs")
# Add 4 to columns 2–4 and normalize those columns by their 10th-row values
cols_adj <- 2:4
dat_norm <- dat
dat_norm[, cols_adj] <- dat_norm[, cols_adj] + 4
norm_vals <- dat_norm[11, cols_adj] |> as.numeric()
dat_norm[, cols_adj] <- sweep(dat_norm[, cols_adj], 2, norm_vals, "/")
# Time series of normalized catch and effort indicators
dat_long <- dat_norm |>
pivot_longer(-Year, names_to = "series", values_to = "value")
ggplot(dat_long, aes(Year, value, color = series)) +
geom_line() +
geom_point(size = 2) +
labs(
x = "Year",
y = "Normalized value",
title = "Catch and effort indicators over time (normalized)",
color = "Series"
) +
theme_minimal()Code
# CPUE proxies: catch per horsepower and per TRB
dat_ratios <- dat_norm |>
mutate(
catch_per_HP = Catch / HP,
catch_per_TRB = Catch / TRB,
) |>
select(Year, catch_per_TRB, catch_per_HP) |>
pivot_longer(-Year, names_to = "ratio", values_to = "value")
ggplot(dat_ratios, aes(Year, value, color = ratio)) +
geom_line() +
geom_point(size = 2) +
labs(
x = "Year",
y = "HKE catch / effort unit",
title = "CPUE proxies: catch per horsepower vs catch per TRB",
color = "Ratio"
) +
theme_minimal()